Abstract. It is becoming increasingly apparent that probabilistic approaches 
can overcome conservatism and computational complexity of the classical worst- 
case deterministic framework and may lead to designs that are actually safer. 
In this paper we argue that a comprehensive probabilistic robustness analysis 
requires a detailed evaluation of the robustness function and we show that 
such evaluation can be performed with essentially any desired accuracy and 
confidence using algorithms with complexity linear in the dimension of the 
uncertainty space. Moreover, we show that the average memory requirements 
of such algorithms are absolutely bounded and well within the capabilities of 
today's computers. 

In addition to efficiency, our approach permits control over statistical sam- 
pling error and the error due to discretization of the uncertainty radius. For 
a specific level of tolerance of the discretization error, our techniques provide 
an efficiency improvement upon conventional methods which is inversely pro- 
portional to the accuracy level; i.e., our algorithms get better as the demands 
for accuracy increase. 



1. Introduction 

In recent years, a number of researchers have proposed probabilistic control 
methods for overcoming the computational complexity and conservatism of the 
deterministic worst-case robust control framework (e.g., [II]~[inj and the 
references therein). 

The philosophy of probabilistic control theory is to sacrifice cases of extreme 
uncertainty. This paradigm has led to the concept of the confidence degradation 
function (originated by Barmish, Lagoa and Tempo [2]), which has proved to be 
extremely powerful for the robustness analysis of uncertain systems. This function, 
$\underline{P}(\cdot)$, is defined as $\underline{P}(r) = \inf_{0 < \rho \le r} P(\rho)$ with 

$$P(\rho) = \frac{\mathrm{vol}\{X \in B_\rho \mid \text{the robustness requirement is guaranteed for } X\}}{\mathrm{vol}\{B_\rho\}}$$

where the volume function vol{.} is the Lebesgue measure, and $B_\rho$ denotes the 
uncertainty bounding set with radius $\rho$. Interestingly, it was discovered in [2] that 
this function is not necessarily monotone decreasing in the uncertainty radius. In 
view of this fact, and to avoid confusion with the concept 
of confidence band used in evaluating the accuracy of the estimate of $P(r)$, 
the confidence degradation function is referred to as the robustness function in this 
paper. Accordingly, a graph representation of the robustness function is called the 
robustness curve. It can be seen that the robustness function is a natural extension 
of the concept of robustness margin. From the robustness curve, one can determine 
the probabilistic robustness margin [2] and estimate the deterministic robustness 
margin. 

In addition to overcoming the NP-hard complexity and conservatism of deterministic 
robustness analysis methods, the robustness function can address very complex 
problems that are intractable by deterministic worst-case methods. Moreover, the 
probability that the robustness requirement is guaranteed can be inferred from the 
robustness function, while the deterministic margin loses the connection with this 
probability. Based on the assumption that the density function of uncertainty is 
radially symmetric and non-increasing with respect to the norm of uncertainty, it 
has been shown in [2] that the probability that the robustness requirement is 
guaranteed is no less than $\underline{P}(r) = \inf_{\rho \in (0, r]} P(\rho)$ when the uncertainty is included in a 
bounding set with radius $r$. The underlying assumption is supported by modeling 
and manufacturing considerations: the uncertainty is unstructured, so that all 
directions are equally likely, and small perturbations from the nominal model 
are more likely than large perturbations. Since $P(\cdot)$ is not monotonically decreasing 
[2], the lower bound of the probability depends on $P(\rho)$ for all $\rho \in (0, r]$. It is not 
clear whether it is feasible to estimate $\underline{P}(r)$, since the estimation of $P(\rho)$ for every 
$\rho$ relies on intensive Monte Carlo simulation and $P(\rho)$ needs to be estimated for 
numerous values of $\rho$. For such a probabilistic method to overcome the NP-hardness of 
worst-case methods, it is necessary to show that the complexity of estimating $\underline{P}(r)$ 
for a given $r$ is polynomial in terms of computer running time and memory space. 
In this paper, we demonstrate that the complexity in terms of space and time is 
surprisingly low and is linear in the uncertainty dimension and the logarithm of the 
relative width of the range of uncertainty radius. 

In the next section we argue that both the deterministic robustness margin and 
its risk-adjusted version, the probabilistic robustness margin, have inherent 
limitations. We address those limitations through the use of the robustness function, 
which can describe the performance of a system over a wide range of uncertainties. 
To construct the robustness function for a wide range of uncertainty radii, 
the conventional method independently estimates $P(r_i)$ for each grid point $r_i$ of the 
uncertainty radius. If there are $m$ grid points and $N$ is the sample size for each radius, then 
the total number of simulations is $Nm$. In Section 3 we use the sample reuse 
principle and demonstrate that the robustness curve for an arbitrarily wide range of 
uncertainty radii can be accurately constructed with surprisingly low complexity. 
Clearly, the number of grid points, $m$, must tend to infinity as the tolerance tends 
to zero. However, we show that with our algorithms, the equivalent number of grid 
points (ENGP), $m_{eq}$, is strictly bounded from above, in the sense that in order to 
guarantee the same level of accuracy for the estimation of the robustness function, 
the required average computational effort is the same as that of a conventional grid 
with $m_{eq}$ points. Moreover, we show that the average memory requirement is also 
absolutely bounded and is well within the reach of modern computers. 

The remainder of the paper is organized as follows. Section 2 provides an example 
illustrating the pitfalls of the deterministic robustness margin and the probabilistic 
robustness margin. Section 4 discusses the control of the estimation error of the 
robustness function and the required complexity. Section 5 investigates the difficulties of 
the conventional data structure. Section 6 describes our new algorithms, analyzes 
the complexity of data processing and memory space, and introduces the concept 
of confidence band. The proofs of all the theorems are included in the Appendices. 



2. The Risk of Robustness Margins 

In this section we make the case for the need to have a robustness function 
in order to properly estimate how well a control system tolerates uncertainties. 
Conventional robust control approaches the issue with a "worst case" philosophy. 
In this regard, it has been demonstrated (Chen, Aravena and Zhou, [5]) that it 
is not uncommon for a probabilistic controller to be significantly less risky than 
a deterministic worst-case control. The reasons are the "uncertainty in modeling 
uncertainties" and the fact that the worst-case design cannot, in some instances, 
be "all encompassing." Therefore, the worst-case approach has an associated risk 
that usually is overlooked, while the probabilistic approach acknowledges the risk 
and manages it. 

From manufacturing and modeling considerations, it is sensible to assume that 
the density of the distribution of uncertainty decreases with increasing uncertainty 
norm. This assumption leads to the worst-case property of the uniform distribution in 
robustness analysis [2]. However, the decay rate of the density is generally unknown 
to the designer. Therefore, for a given uncertainty radius $r$, one does not have 
good knowledge of the coverage probability of the uncertainty set $B_r$. It is 
important to note that the system robustness depends critically on the distribution 
of the uncertainty norm. 

Attempts to improve the analysis have led to the definitions of a deterministic 
robustness margin and a probabilistic robustness margin. Both are numbers that 
purportedly allow the user to estimate the tolerance to uncertainties. We contend 
that both can be misleading, and for essentially the same reason. To demonstrate 
this viewpoint, we consider the feedback system shown in Figure 1. 

Figure 1. Standard Feedback Configuration 



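The transfer function of the plant is $G(s) = \frac{q}{s - p}$, where $p$ and $q$ are uncertain 
parameters. The uncertainty bounding set with radius $r > 0$ is 

$$B_r = \{(x, y) : |x - q_0| \le r,\ |y - p_0| \le r\}, \qquad p_0 < 0,\ q_0 > 0.$$

Consider two controllers $C_A = \frac{K_A}{s + \sigma}$, $\sigma > 0$, and $C_B = K_B$ such that 

$$1 < K_B < \frac{K_A}{\sigma}, \qquad \frac{K_A q_0 - \sigma p_0}{K_A + \sigma} < \sigma - p_0.$$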






XINJIA CHEN, KEMIN ZHOU AND JORGE L. ARAVENA 



Suppose that the robustness requirement is stability. It can be shown that the 
robustness function for controller A equals 1 for $0 < r < \rho_A$ and is given by a 
closed-form expression involving $\beta = \min(\sigma, p_0 + r)$ for $r \ge \rho_A$, where 

$$\rho_A = \frac{K_A q_0 - \sigma p_0}{K_A + \sigma}$$

is the deterministic robustness margin. 
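It can be shown that the robustness function for controller B equals 1 for 
$0 < r < \rho_B$ and is given by separate closed-form expressions for $\rho_B \le r < \rho_B^*$ 
and for $r \ge \rho_B^*$, where 

$$\rho_B = \frac{K_B q_0 - p_0}{K_B + 1}$$

is the deterministic robustness margin and $\rho_B^* > \rho_B$ is the radius at which the expression changes. 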

We consider an example with $p_0 = -10$, $q_0 = 50$, $\sigma = 40$, $K_A = 100\sigma$, $K_B = 10$. 
The corresponding robustness functions are displayed in Figure 2. We obtained 
deterministic margins $\rho_A = 49.6040$, $\rho_B = 46.3636$. Since $\rho_A > \rho_B$, a comparison 
based on the deterministic margin simply suggests that controller A is more robust 
than controller B. On the contrary, a judgement based on the robustness curves 
indicates that controller B may be more robust. The risk of the probabilistic 
robustness margin can also be illustrated by this example. 
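
The comparison can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the plant $G(s) = q/(s - p)$ with controllers $C_A = K_A/(s + \sigma)$ and $C_B = K_B$, a reading of the example that reproduces the quoted margins. The function names, the sample size and the seed are hypothetical. Stability is tested through the sign conditions on the coefficients of the closed-loop characteristic polynomial.

```python
import random

# Example data from the text (KA = 100 * sigma).
P0, Q0, SIGMA, KA, KB = -10.0, 50.0, 40.0, 4000.0, 10.0


def stable_a(q, p):
    # Assumed loop A: s^2 + (sigma - p) s + (KA q - sigma p) must have positive coefficients.
    return SIGMA - p > 0 and KA * q - SIGMA * p > 0


def stable_b(q, p):
    # Assumed loop B: s + (KB q - p) must have a positive constant term.
    return KB * q - p > 0


def robustness(stable, r, n=20000, seed=0):
    """Monte Carlo estimate of P(r) over the box |q - q0| <= r, |p - p0| <= r."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        q = Q0 + r * rng.uniform(-1.0, 1.0)
        p = P0 + r * rng.uniform(-1.0, 1.0)
        hits += stable(q, p)
    return hits / n


# Deterministic margins under the assumed loop structure.
rho_a = (KA * Q0 - SIGMA * P0) / (KA + SIGMA)
rho_b = (KB * Q0 - P0) / (KB + 1.0)
```

Under these assumptions, `rho_a` and `rho_b` agree with the margins quoted above, while evaluating `robustness(stable_a, r)` and `robustness(stable_b, r)` at a radius beyond both margins (for instance `r = 60`) gives the larger estimate for controller B.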

Robust analysis should be able to help a designer to reliably determine which 
controller design is more robust. However, it appears that the concepts of robust- 
ness margin fail to meet such fundamental needs of control engineering. On the 
other hand, the robustness curve serves the purpose of giving the designer complete 
information on how well a control system tolerates uncertainties. 

From the previous discussion, it can be seen that there are two crucial factors to 
be considered in order to make a reliable judgment about the system robustness: 

(i) : How fast the robustness curve rolls off. 

(ii) : The dependency of coverage probability of uncertainty bounding set Br 
on the radius r. 

The second factor can be a difficulty since a designer generally lacks knowledge 
of the coverage probability corresponding to a bounding set of fixed radius. To 
overcome this difficulty, the only choice is to construct the robustness curve for a 
wide range of uncertainty radii. The construction of the robustness curve may 
be seen as a computationally challenging task, since the probability of guaranteeing 
the robustness requirement needs to be estimated for many values of the uncertainty radius. 
However, as we demonstrate in the next section, using the sample reuse principle one 
can construct the robustness curve for virtually the entire uncertainty range 
$(0, \infty)$ with absolutely bounded average computational requirements, regardless of 
the size of the grid. For example, we shall show that for an uncertainty range as 
large as $(10^{-10}, 10^{10})$, one needs on average less than 50 times the memory and 
computational resources needed to evaluate the uncertainty range $(1, e)$ 
with the same resolution. 

Figure 2. Comparison of Controller Alternatives. 

3. Equivalent Number of Grid Points 

Throughout this paper, we assume that the uncertainty sets are homogeneous 
star-shaped (e.g., [2]). That is, the uncertainty bounding set with radius $r$ is 
$B_r = \{rX \mid X \in B_1\}$, where $B_1$ denotes an uncertainty bounding set such that 
$cX \in B_1$ for any $X \in B_1$ and any $c \in [0, 1]$. Clearly, most of the commonly 
used uncertainty bounding sets, such as the $\ell_p$ balls and spectral norm balls, are 
homogeneous star-shaped. 

We shall consider the problem of constructing the robustness curve for an arbitrary 
robustness requirement $\mathbf{P}$ under this assumption on the uncertainty sets. Conventionally, 
the robustness curve for a range of uncertainty radii $[a/\lambda, a]$ with $a > 0$, $\lambda > 1$ 
is constructed by choosing a set of grid points $a/\lambda = r_1 < r_2 < \cdots < r_m = a$ and, 
for every grid point, performing $N$ i.i.d. Monte Carlo simulations. Hence, the total 
number of simulations is a deterministic constant $mN$. To reduce computational 
complexity, we shall make use of the following intuitive concept: 

Let $X$ be an observation of a random variable with uniform distribution over 
$B_\rho \supseteq B_r$ such that $X \in B_r$. Then $X$ can also be viewed as an observation of a 
random variable with uniform distribution over $B_r$. 

In order to apply this concept, it is necessary to perform the simulation in a 
backward direction, so that appropriate evaluations of the robustness requirement for 
larger uncertainty sets can be saved for use in later simulations on smaller 
uncertainty sets [6]. The sample reuse principle allows a single simulation to be 
used for multiple radii. Thus, the actual total number of simulations is significantly 
reduced. In order to quantify this reduction we introduce the equivalent number of 
grid points (ENGP), $m_{eq}$, defined as 

$$m_{eq} = \frac{\text{expected total number of simulations}}{N}.$$

In our approach, the number of simulations required at uncertainty radius $r_i$, 
denoted by $n_i$ for $i = 1, \cdots, m$, is a random number. The total number of 
simulations can be represented by the random variable $n = \sum_{i=1}^{m} n_i$. The expected value 
of the total number of simulations is $E[n] = \sum_{i=1}^{m} E[n_i]$, where $E[X]$ denotes the 
expectation of the random variable $X$. Hence, we can formally define 

$$m_{eq} = \frac{E[n]}{N}.$$

Due to sample reuse, we can achieve a substantial reduction of simulations, i.e., 
$E[n] \ll mN$. To quantify the reduction of the computational effort, we have 
introduced the notion of sample reuse factor [6], which is defined as 

$$\mathcal{F}_{reuse} \overset{\text{def}}{=} \frac{mN}{E[n]} = \frac{m}{m_{eq}}.$$

In our approach, $N$ i.i.d. simulation results are collected for each grid point. 
Hence, the accuracy of estimation is the same as that of the conventional method. 
However, the average number of simulations in our approach is $E[n]$, which is 
equivalent to the complexity of $m_{eq}$ grid points in the conventional scheme. As a 
direct consequence of Theorem 1 of [6], we have that, for any discretization scheme, 
$m_{eq}$ is independent of the sample size $N$. Moreover, we have the following general 
result. 

Theorem 1. Let $d$ be the dimension of the uncertainty parameter space. Then, for an 
arbitrary gridding scheme, the equivalent number of grid points based on the principle 
of sample reuse is strictly bounded from above by $1 + d \ln \lambda$, i.e., 

$$m_{eq} < 1 + d \ln \lambda.$$

See Appendix A for a proof. By an "arbitrary" discretization scheme, we mean 
two things: i) the number of grid points can be arbitrarily large; ii) the grid points 
can be distributed arbitrarily over the specified range of uncertainty radius. 
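
The sample reuse principle and the bound of Theorem 1 can be illustrated with a short simulation. The sketch below is an illustration under stated assumptions rather than the algorithm of Section 6: it uses $\ell_\infty$ balls, tracks only how many samples are accumulated at each grid radius (the robustness requirement itself is not evaluated), and keeps a dense list of counts. The function name and its parameters are hypothetical.

```python
import random


def sample_reuse_counts(radii, d, n_target, seed=0):
    """Backward sample reuse on l_inf balls.

    radii must be increasing. Returns (total number of draws, per-radius counts).
    """
    rng = random.Random(seed)
    counts = [0] * len(radii)
    draws = 0
    k = len(radii) - 1  # largest grid index that still needs samples
    while k >= 0:
        # One simulation: X uniform on the l_inf ball of radius radii[k] in R^d.
        x_norm = radii[k] * max(abs(rng.uniform(-1.0, 1.0)) for _ in range(d))
        draws += 1
        # X lies in B_r for every grid radius r >= ||X||, so it is reused there.
        i = k
        while i >= 0 and radii[i] >= x_norm:
            counts[i] += 1
            i -= 1
        # Move down to the next radius that has not yet collected n_target samples.
        while k >= 0 and counts[k] >= n_target:
            k -= 1
    return draws, counts
```

For a uniform grid of 10 radii on $[0.1, 1]$ with $d = 2$, dividing the returned number of draws by $N$ gives an empirical equivalent number of grid points that can be compared with $1 + d \ln \lambda \approx 5.6$ and with the conventional $m = 10$.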

A fundamental question of robust control is whether randomized algorithms have 
polynomial complexity. Because the cost of each simulation depends on the 
problem instance, the computational complexity is usually measured in terms of the 
number of simulations. This theorem reveals the following important facts: 

(a) : The complexity is linear in the dimension of the uncertainty space. Thus 
our algorithms overcome the curse of dimensionality. 

(b) : The complexity depends linearly on the logarithm of the "relative" width, 
$\lambda$, of the interval of uncertainty radii. This shows that our algorithms are 
capable of estimating the robustness function for a wide range of 
uncertainty. 

(c) : Our algorithms can arbitrarily reduce the grid error, while keeping the 
complexity strictly below a constant bound. 
In order to illustrate these points, Figure 3 displays the variation of $m_{eq}$ for 
various dimensions of the uncertainty space and for values of $\lambda$ up to $\lambda = 10^{20}$, 
corresponding to the uncertainty range $(10^{-10}, 10^{10})$ (which may be deemed a good 
approximation to $(0, \infty)$). Notice that even for dimensions as high as $d = 1024$ the 
equivalent number of grid points, $m_{eq}$, is very reasonable. 




Figure 3. Absolute Bounds for $m_{eq}$ (ENGP) ($d = 2^i$, $i = 1, \cdots, 10$). 
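
The bound of Theorem 1 is easy to tabulate. The sketch below evaluates $1 + d \ln \lambda$ for $d = 2^i$, $i = 1, \cdots, 10$, with $\lambda = 10^{20}$, and compares it with the same bound for the range $(1, e)$, for which $\lambda = e$. Reading the "less than 50 times" statement as a ratio of these two bounds is an interpretation made here, and the function name is hypothetical.

```python
import math


def engp_bound(d, lam):
    """Upper bound 1 + d ln(lambda) on the equivalent number of grid points."""
    return 1.0 + d * math.log(lam)


# Bounds for d = 2, 4, ..., 1024 over the range (1e-10, 1e10), i.e. lambda = 1e20.
bounds = {2 ** i: engp_bound(2 ** i, 1e20) for i in range(1, 11)}

# Ratio to the bound for the range (1, e), i.e. lambda = e, at the same dimension.
ratios = [engp_bound(d, 1e20) / engp_bound(d, math.e) for d in bounds]
```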



Finally in this section, we consider the case where we need to estimate $P(r)$ for 
$r \in [\gamma, U]$, where $\gamma > 0$ is a constant and $U$ is an estimate of the probabilistic 
robustness margin calculated by randomized algorithms. Clearly, $U$ is a random 
variable. If $U$ depends on samples which are independent of the samples generated 
from the uncertainty set with radius $r \in [\gamma, U]$, we have the following result: 

For any gridding scheme, 

$$m_{eq} < 1 + d \ln \frac{E[U]}{\gamma}. \tag{3.2}$$



To prove (3.2), notice that $E[E[X \mid Y]] = E[X]$ for any random variables $X$ and 
$Y$. Hence, by Theorem 1, 

$$m_{eq} = E\big[E[m_{eq} \mid U]\big] < E\left[1 + d \ln \frac{U}{\gamma}\right] = 1 + d\, E\left[\ln \frac{U}{\gamma}\right] \le 1 + d \ln \frac{E[U]}{\gamma},$$

where the last inequality is obtained by applying Jensen's inequality to the 
concave function $\ln(\cdot)$. 
4. Error Control 

In addition to efficiency, another important issue in any numerical approach is 
error control. This point has been emphasized in many control engineering problems. 
For instance, when computing the $H_\infty$ norm of a system, a lower bound and 
an upper bound are obtained, and it is required that the gap between them be less 
than a prescribed tolerance. A similar situation arises in the computation of the 
structured singular value ($\mu$). 

For the specific case of the estimation of the robustness function, there are two 
sources of error: i) the statistical sampling error due to the finiteness of the sample 
size $N$ (sample size error); ii) the discretization error due to the finite number 
of points in any partition. Control of the sample size error has been well studied 
and emphasized. Existing techniques include the Chernoff bounds [8], binomial 
confidence intervals [7], [5], etc. However, we claim that control of the discretization 
error is not sufficiently emphasized. In fact, one can argue that controlling the 
sample size error can be meaningless if the discretization error is not controlled. 
This will be the case, for example, in situations where a risk at the level 
of a small $\varepsilon$ (e.g., $\varepsilon = 0.001$) may be significant or unacceptable. How can any 
estimate be useful if the discretization error is not ensured to be less than the 
tolerance $\varepsilon$? 

In this section, we first introduce an interpolation result necessary to analyze 
error control methods. Afterward, we discuss two different schemes which ensure 
a discretization error less than a given $\varepsilon \in (0, 1)$. The first is a uniform partition 
whereby the uncertainty radius interval $[a/\lambda, a]$ is partitioned by $m$ points 

$$r_i = a - \frac{(m - i)(\lambda - 1)}{(m - 1)\lambda}\, a, \qquad i = 1, \cdots, m. \tag{4.1}$$

In the second scheme we consider a geometric type partition of the form 

$$r_i = a\, \lambda^{\frac{i - m}{m - 1}}, \qquad i = 1, \cdots, m. \tag{4.2}$$
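
The two partitions can be generated as follows. This is a minimal sketch with hypothetical function names: the uniform grid follows (4.1), while the geometric grid is taken to be the constant-ratio grid with the same endpoints $a/\lambda$ and $a$, which is an assumption about the form of (4.2).

```python
def uniform_grid(a, lam, m):
    """Uniform partition of [a / lam, a] with m points, as in (4.1)."""
    return [a - (m - i) * (lam - 1.0) / ((m - 1) * lam) * a for i in range(1, m + 1)]


def geometric_grid(a, lam, m):
    """Constant-ratio partition of [a / lam, a] with m points (assumed form of (4.2))."""
    return [a * lam ** ((i - m) / (m - 1)) for i in range(1, m + 1)]
```

Both grids start at $a/\lambda$ and end at $a$; the geometric grid places more points near the small radii, where a fixed spacing corresponds to a larger relative change in volume.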



For any partition of the uncertainty radius interval, we have the following linear 
interpolation results. 

Theorem 2. Given an arbitrary partition of the uncertainty radius interval $[a/\lambda, a]$ 
with $a/\lambda = r_1 < r_2 < \cdots < r_m = a$, define 

$$P^*(r) = \frac{(r - r_i)\, P(r_{i+1}) + (r_{i+1} - r)\, P(r_i)}{r_{i+1} - r_i}, \qquad g(r) = (r_{i+1} - r) \left(\frac{r}{r_i}\right)^{-d} + (r - r_i) \left(\frac{r_{i+1}}{r}\right)^{-d}.$$

Then, for all $r \in [r_i, r_{i+1}]$, 

$$|P(r) - P^*(r)| \le 1 - \frac{g(r^*)}{r_{i+1} - r_i} < \frac{d}{2 r_i} (r_{i+1} - r_i),$$

where $r^* \in (r_i, r_{i+1})$ is the unique solution of the equation 

$$\left(\frac{r}{r_i}\right)^{-d} \left[1 + \left(\frac{r_{i+1}}{r} - 1\right) d\right] = \left(\frac{r_{i+1}}{r}\right)^{-d} \left[1 + \left(1 - \frac{r_i}{r}\right) d\right]$$

with respect to $r$, which can be solved by a bisection search. 

See Appendix B for a proof. As mentioned before, these interpolation results will 
be used in the construction of a tight confidence band for the robustness function. 
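
The linear interpolation and its looser closed-form error bound can be tried on a toy case. The sketch below is illustrative only: the requirement "$\|X\|_\infty \le c$" on an $\ell_\infty$ ball is a hypothetical example, chosen because its robustness function $\min\{1, (c/r)^d\}$ is known exactly, and all function names are assumptions.

```python
def interpolate(r, r_lo, r_hi, p_lo, p_hi):
    """Linear interpolation P*(r) between two adjacent grid radii."""
    return ((r - r_lo) * p_hi + (r_hi - r) * p_lo) / (r_hi - r_lo)


def interpolation_bound(r_lo, r_hi, d):
    """Closed-form bound d (r_hi - r_lo) / (2 r_lo) on |P(r) - P*(r)|."""
    return d * (r_hi - r_lo) / (2.0 * r_lo)


def toy_robustness(r, c=1.0, d=3):
    """Exact P(r) when the requirement is ||X||_inf <= c on an l_inf ball of radius r."""
    return 1.0 if r <= c else (c / r) ** d


# Largest interpolation error on [1.0, 1.1], scanned on a fine grid of radii.
p_lo, p_hi = toy_robustness(1.0), toy_robustness(1.1)
worst = max(
    abs(toy_robustness(1.0 + 0.001 * k) - interpolate(1.0 + 0.001 * k, 1.0, 1.1, p_lo, p_hi))
    for k in range(101)
)
```

In this toy case the observed error `worst` is well below `interpolation_bound(1.0, 1.1, 3)`, as expected from a bound that must hold for every robustness requirement.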

Remark 1. To guarantee a prescribed tolerance $\varepsilon \in (0, 1)$, the number of grid points 
must be larger than a certain number. It has been shown by Barmish, Lagoa and 
Tempo [2] that if 

$$m \ge 1 + \frac{2(\lambda - 1) d}{\varepsilon} \tag{4.3}$$

then $|P(r) - P(r_i)| < \varepsilon$ for all $r \in [r_i, r_{i+1}]$, $i = 1, \cdots, m - 1$. This bound shows that, 
for fixed error $\varepsilon$, the complexity is polynomial. From another perspective, it also 
shows that the number of grid points and the computational complexity tend to infinity 
as the tolerance tends to zero. For example, the robustness analysis problem for 
complex uncertainty of size $30 \times 30$ over an interval of uncertainty with $\lambda = 10$ 
requires $m \ge 3{,}240{,}000{,}001$ in order to guarantee $\varepsilon \le 10^{-5}$. The bound, however, 
does not account for the sample reuse principle. Using our approach, the equivalent 
number of grid points for this case is bounded from above by $1 + 1800 \times \ln(10)$. 

The following result is our extension of the result by Barmish et al. cited above, 
and quantifies the advantage of using linear interpolation. 

Theorem 3. Let 

$$m = 2 + \left\lfloor \frac{(\lambda - 1) d}{2 \varepsilon} \right\rfloor \tag{4.4}$$

where $\lfloor \cdot \rfloor$ denotes the floor function. Then, for a uniform gridding scheme, 

$$|P(r) - P^*(r)| < \varepsilon \qquad \forall r \in [r_i, r_{i+1}]$$

for $i = 1, \cdots, m - 1$. Moreover, the equivalent number of grid points is 

$$m_{eq}(\varepsilon) = m - \sum_{j=1}^{m-1} \left(1 - \frac{1}{j + \frac{m-1}{\lambda - 1}}\right)^d.$$

See Appendix C for a proof. 



Remark 2. We point out that when using linear interpolation the number of grid 
points given by (4.4) is approximately $\frac{1}{4}$ of the bound given by (4.3). 
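
The grid sizes in Remarks 1 and 2 can be reproduced with integer arithmetic. The sketch below assumes the readings $m \ge 1 + 2(\lambda - 1)d/\varepsilon$ for (4.3) and $m = 2 + \lfloor(\lambda - 1)d/(2\varepsilon)\rfloor$ for (4.4); the tolerance is passed as its reciprocal so that the arithmetic stays exact. The function names are hypothetical.

```python
import math


def grid_points_no_interp(lam, d, eps_inv):
    """Bound (4.3): 1 + 2 (lambda - 1) d / eps, with eps = 1 / eps_inv."""
    return 1 + 2 * (lam - 1) * d * eps_inv


def grid_points_interp(lam, d, eps_inv):
    """Choice (4.4): 2 + floor((lambda - 1) d / (2 eps)), with eps = 1 / eps_inv."""
    return 2 + ((lam - 1) * d * eps_inv) // 2


def engp_bound(d, lam):
    """Upper bound 1 + d ln(lambda) on the equivalent number of grid points."""
    return 1.0 + d * math.log(lam)


# Complex 30 x 30 uncertainty: d = 2 * 30 * 30 = 1800 real parameters.
m_plain = grid_points_no_interp(10, 1800, 10 ** 5)
m_interp = grid_points_interp(10, 1800, 10 ** 5)
```

With these readings, `m_interp / m_plain` is close to 1/4, and `engp_bound(1800, 10)` is about 4146, more than five orders of magnitude below either grid size.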

We now analyze a discretization scheme whereby the partition of the uncertainty 
interval under study is defined by a geometric series. 

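Theorem 4. For a geometric discretization scheme with 

$$m = 2 + \left\lfloor \frac{\ln \lambda}{\ln\left(1 + \frac{2\varepsilon}{d}\right)} \right\rfloor$$

and 

$$r_i = a\, \lambda^{\frac{i - m}{m - 1}}$$

for $i = 1, \cdots, m$, the following statements hold true: 

(I): $|P(r) - P^*(r)| < \varepsilon \quad \forall r \in [r_i, r_{i+1}], \ i = 1, \cdots, m - 1.$

(II): $m_{eq}(\varepsilon) = 1 + \left(1 + \left\lfloor \frac{\ln \lambda}{\ln\left(1 + \frac{2\varepsilon}{d}\right)} \right\rfloor\right) \left(1 - \lambda^{-\frac{d}{m-1}}\right).$

(III): $\mathcal{F}_{reuse} > \frac{1}{2\varepsilon} \cdot \frac{d \ln \lambda}{1 + d \ln \lambda}.$

See Appendix D for a proof. 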

Remark 3. Since $1 + d \ln \lambda \gg 1$ in many situations, the sample reuse factor for 
the geometric discretization scheme may be written in a more elegant form. That 
is, 

$$\mathcal{F}_{reuse} \approx \frac{1}{2\varepsilon},$$

which is inversely proportional to the tolerance of the discretization error. For 
example, to ensure that the discretization error is less than $10^{-4}$, which is a rather 
weak requirement for many applications, our algorithm reduces the computational 
effort by a factor of 5,000 when compared to a conventional approach. 
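
The order of magnitude in Remark 3 can be checked numerically. The sketch below rests on two assumptions that should be read as such: the grid size is $m = 2 + \lfloor \ln\lambda / \ln(1 + 2\varepsilon/d) \rfloor$, and each grid point below the largest requires on average a fraction $1 - (r_i/r_{i+1})^d$ of fresh simulations, which for a constant-ratio grid gives $m_{eq} = 1 + (m - 1)(1 - \lambda^{-d/(m-1)})$. The function name is hypothetical.

```python
import math


def geometric_scheme(lam, d, eps):
    """Return (m, m_eq, reuse factor) for a constant-ratio grid on [a / lam, a]."""
    # Assumed grid size ensuring r_{i+1} / r_i < 1 + 2 eps / d.
    m = 2 + math.floor(math.log(lam) / math.log(1.0 + 2.0 * eps / d))
    # Assumed expected cost: a fraction 1 - (r_i / r_{i+1})^d of N fresh
    # simulations per grid point, plus N simulations at the largest radius.
    m_eq = 1.0 + (m - 1) * (1.0 - lam ** (-d / (m - 1)))
    return m, m_eq, m / m_eq


m, m_eq, factor = geometric_scheme(1e3, 5, 1e-4)
```

For $\lambda = 10^3$, $d = 5$ and $\varepsilon = 10^{-4}$, the computed factor is somewhat below $1/(2\varepsilon) = 5{,}000$, because $1 + d \ln \lambda \approx 35.5$ is large but finite; the approximation tightens as $d \ln \lambda$ grows.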

The two discretization schemes considered here, and others, have bounded 
complexity, but the distributions of the total number of simulations are different. Hence 
it is reasonable to ask if there is a "best discretization." Our results indicate that 
the geometric scheme is generally more efficient, as shown by the comparison of 
grid points in Figure 4 and the comparison of ENGP in Figure 5. 



Curves shown: our uniform scheme, existing uniform scheme, our geometric scheme. Horizontal axis: tolerance $\varepsilon$. 



Figure 4. Comparison of Number of Grid Points (A = 10^, d = 500) 
Figure 5. Comparison of $m_{eq}$ ($\lambda = 10^3$, $d = 5$, $1 + d \ln \lambda = 35.5388$). Horizontal axis: tolerance $\varepsilon$. 

5. The Difficulties of Conventional Data Structure 

Our previous sample reuse algorithm [6] uses the same data structure as that 
of the conventional algorithm. That is, the data structure for implementing the 
algorithms is basically a matrix of fixed size. In such a data structure, for each 
grid point $r_i$, there is a record $(k_i, n_i)$ where $k_i$ represents the number of cases 
guaranteeing (or violating) the robustness requirement among $n_i$ simulations. In 
the course of the experiment, the number $n_i$ is incremented from 0 to the sample size $N$. In 
the following two subsections, we demonstrate that the conventional data structure 
is not suitable for controlling the error due to finite gridding. 

5.1. The Issue of Data Processing. Clearly, the total number of records is 
exactly the number of grid points $m$. For the conventional method, to accomplish 
$N$ simulations for each grid point, the total number of updates of the data record is 
$Nm$. As illustrated in Section 4, controlling the error due to finite gridding requires 
an extremely large number, $m$, of grid points even for a moderate requirement on 
$\varepsilon$. Therefore, $Nm$ is usually a very large number. It can be shown that if the 
sample reuse algorithm employs the same data structure as that of the conventional 
method, then, for any gridding scheme with $m$ grid points, the total number of 
updates of the data record is also $Nm$. This is true because, every time 
a record $(k_i, n_i)$ is updated, the number $n_i$ can only be increased by 1, and the 
number $n_i$ must be $N$ when the experiment is completed. To appreciate that 
data processing with the conventional data structure is a severe challenge, one can 
consider the example discussed in Remark 1 of Section 4. With $m \ge 3{,}240{,}000{,}001$ 
and normal sample size 10^ < N < 10^, it can be seen that Nm will be in the range 
of 3x 10'^'^ to 3x 10^^. This is an enormous burden for today's computing technology. 
For a modern computer with 1.9 GHz CPU and 256 M bytes RAM, it takes about 
20 seconds to execute lO'' times the command ^ + 1 written in the MATLAB 
language. It can be reasonably inferred that updating the data record for 3 x 10^"^ 
times will take about 20 x 10^^ x 3 x 10^'^ seconds (i.e., about 700 days). 

5.2. The Issue of Memory Space. For the conventional data structure, the total 
number of records is $m$. To execute the sample reuse algorithm or the conventional 
one with such a data structure, each record must occupy some physical addresses. 
Such addresses are necessary for storing and visualizing the outcome of simulations. 
Of course, obtaining the outcome of the simulations may require a much larger amount 
of computer internal memory to execute the algorithm. Since $m$ is usually a very 
large number, the consumption of memory to store and visualize the output of 
simulation can be enormous. To illustrate, consider again the example discussed in 
Remark 1 of Section 4. Since a floating point number occupies 2 bytes, storing a 
tuple of the form $(k_i, n_i)$ needs 4 bytes. For $m \ge 3{,}240{,}000{,}001$, the data record 
will consume $4 \times 3{,}240{,}000{,}001 \approx 13 \times 10^9$ bytes (i.e., about 13 gigabytes) of 
RAM. Such a requirement, just for visualizing the outcome of the simulations, is 
challenging even for modern computers. 



6. New Techniques of Sample Reuse 

In the last section, we have shown that any algorithm using the conventional 
data structure suffers from the complexity of data processing and 
memory space. This is because the sample size $N$ is usually very large and the 
number, $m$, of points in the partition of the uncertainty radius approaches infinity as the 
tolerance, $\varepsilon$, approaches zero (see Theorem 3). In this section, we shall demonstrate 
that, by introducing a dynamic data structure and a new sample reuse algorithm, 
the average memory requirement and the computational effort devoted to data 
processing are absolutely bounded, independent of the tolerance, and well within 
the power of modern computers. 

6.1. Data Structure. In order to address the memory issue and minimize the 
effort devoted to data processing, an appropriate data structure is critical. The 
key idea is to make use of the observation that, for a set of consecutive grid points 
with identical records of simulation results, it suffices to store the information of 
the smallest and the largest grid points. To illustrate our techniques, we enumerate, 
in chronological order of generation, the samples generated from various uncertainty 
bounding sets as $X_1, X_2, \cdots$. When samples $X_1, X_2, \cdots, X_j$ have been generated, 
the state of the experiment is completely represented by functions $s(i, j)$ and $v(i, j)$, 
where 



V 



k=l 



k=l 



with 




otherwise 




(6.1) 
and 

if Xk E Br, and P is violated for 
otherwise 



Zf = 



for $i = 1, \cdots, m$ and $k = 1, \cdots, j$. The reason for introducing the variable defined by 
(6.1) is that, for grid point $r_i$, once $N$ equivalent simulations are available, the 
subsequent simulations can be ignored. By the principle of sample reuse, $s(i, j)$ 
and $v(i, j)$ are, respectively, the accumulated numbers of samples and violations 
for the uncertainty bounding set with radius $r_i$. When the experiment is completed, 
we have $n$ samples $X_1, X_2, \cdots, X_n$ and 

$$s(i, n) = N, \qquad \widehat{P}(r_i) = 1 - \frac{v(i, n)}{N}, \qquad i = 1, \cdots, m.$$

It can be seen that $s(i, j)$ is piece-wise constant (with respect to $i$) and there exists 
a matrix $S^j$ such that, for $i = 1, \cdots, m$, 

$$s(i, j) = \begin{cases} [S^j]_{\ell,2} & \text{for } [S^j]_{\ell,1} \le i < [S^j]_{\ell+1,1} \text{ with } 1 \le \ell \le \kappa - 1; \\ [S^j]_{\kappa,2} & \text{for } [S^j]_{\kappa,1} \le i \le m \end{cases} \tag{6.2}$$

where $\kappa$ is the number of rows of $S^j$ and $[A]_{i,j}$ denotes the element of matrix 
$A$ in the $i$-th row and the $j$-th column. Roughly speaking, the first column of 
matrix $S^j$ records the indexes of grid points at which the accumulated numbers of 
samples jump to different values. The second column of matrix $S^j$ records 
the corresponding accumulated numbers of samples. 
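
The two-column representation can be sketched as a list of breakpoints with a bisection lookup. The code below is an illustration with hypothetical names, not the authors' implementation: each row holds the first grid index of a run of equal counts and the count itself, and grid indices below the first stored row are taken to have count 0, which is an assumption of this sketch.

```python
from bisect import bisect_right


def compress(values):
    """Run-length compress a dense list of counts (grid indices start at 1)."""
    breaks = []
    for idx, val in enumerate(values, start=1):
        if not breaks or breaks[-1][1] != val:
            breaks.append((idx, val))
    return breaks


def lookup(breaks, i):
    """Count stored for grid index i in a list of (start_index, value) rows."""
    starts = [row[0] for row in breaks]
    pos = bisect_right(starts, i) - 1
    return breaks[pos][1] if pos >= 0 else 0


dense = [0, 0, 3, 3, 3, 7, 7]
rows = compress(dense)
```

The number of stored rows equals the number of distinct runs rather than the number of grid points, which is the property that keeps memory use independent of the grid size when long runs share the same count.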

Similarly, $v(i, j)$ is piece-wise constant (with respect to $i$) and there exists a 
matrix $V^j$ such that, for $i = 1, \cdots, m$, 

$$v(i, j) = \begin{cases} [V^j]_{\ell,2} & \text{for } [V^j]_{\ell,1} \le i < [V^j]_{\ell+1,1} \text{ with } 1 \le \ell \le \tau - 1; \\ [V^j]_{\tau,2} & \text{for } [V^j]_{\tau,1} \le i \le m \end{cases} \tag{6.3}$$

where $\tau$ is the number of rows of $V^j$. Loosely speaking, the first column of matrix 
$V^j$ records the indexes of grid points at which the accumulated numbers of 
violations jump to different values. The second column of matrix $V^j$ records the 
corresponding accumulated numbers of violations. 

In this paper, matrices $S^j$ and $V^j$ are, respectively, referred to as the matrix of 
sample sizes and the matrix of violations. At any stage at which samples $X_1, \cdots, X_j$ 
have been generated, the status of the experiment is completely characterized by 
matrices $S^j$, $V^j$. Both matrices have two columns, but the number of rows varies 
in the course of the experiment. 

To save memory and data processing effort, we shall take advantage of the 
piece-wise constant property of the accumulated numbers of samples and violations. 
Hence, we shall construct matrices $S^j$ and $V^j$ when we have generated $X_1, \cdots, X_j$. 
As can be seen in the sequel, such matrices can be constructed recursively. Once we 
have $S^j$ and $V^j$, we can generate sample $X_{j+1}$ and update $S^j$, $V^j$ as $S^{j+1}$, $V^{j+1}$ 
in accordance with equations (6.2) and (6.3). 



6.2. Sample Reuse Algorithm. In this section, we present our sample reuse 
algorithm as follows. 

Initialization: We initialize the matrices of sample sizes and violations as 
follows: 

◊ Generate sample $X_1$ uniformly from the uncertainty set with radius $r_m$. 
◊ Compute $J$ such that $X_1 \in B_{r_i}$ for $J \le i \le m$ and $X_1 \notin B_{r_i}$ for 
$1 \le i \le J - 1$. 
◊ Let $S^1 = [1 \;\; 1]$ if $J = 1$ and $S^1 = \begin{bmatrix} 1 & 0 \\ J & 1 \end{bmatrix}$ if $J > 1$. 
◊ Let $V^1 = [1 \;\; 0]$ if $I(X_1) = 0$ and $V^1 = S^1$ if $I(X_1) = 1$, where 
$I(X) = 1$ if the robustness requirement is violated for $X$ and otherwise 
$I(X) = 0$. 

Sample generation: If $[S^j]_{\kappa,2} < N$ then generate sample $X_{j+1}$ uniformly 
from the uncertainty set with radius $r_m$, otherwise generate sample $X_{j+1}$ 
uniformly from the uncertainty set with radius $r_{[S^j]_{\kappa,1} - 1}$. 

Updating matrices: Update $S^j$ as $S^{j+1}$ by the method described in Section 
6.2.1. If $I(X_{j+1}) = 0$ then let $V^{j+1} = V^j$, otherwise update $V^j$ as $V^{j+1}$ by 
the method described in Section 6.2.2. 

Stopping criterion: The sampling process is terminated if $S^j$ has only one 
row and $[S^j]_{1,2} = N$. 

6.2.1. Sample Sizes Tracking. In this section, we describe how to update the matrix 
of sample sizes. The key idea is to ensure condition (6.2). Let $\kappa$ be the number of 
rows of $S^j$. We proceed as follows. 

Step (1): Compute an index $j^*$ such that $X_{j+1} \in B_{r_i}$ for $j^* \le i \le m$ and 
$X_{j+1} \notin B_{r_i}$ for $1 \le i \le j^* - 1$ (note that explicit formulas for computing 
$j^*$ are available when using a uniform or geometric grid scheme). 



Step (2): Modify $S^j$ as a temporary matrix $\widehat{S}^{j+1}$ based on the following three
cases.

Case (1): $[S^j]_{\ell^*,1} < j^* < [S^j]_{\ell^*+1,1}$ for some $\ell^* \in \{1, \cdots, \kappa - 1\}$;
Case (2): $j^* = [S^j]_{\ell^*,1}$ for some $\ell^* \in \{1, \cdots, \kappa\}$;
Case (3): $j^* > [S^j]_{\kappa,1}$.



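In Case (1), define $\widehat{S}^{j+1}$ as a $(\kappa + 1) \times 2$ matrix such that

$$[\widehat{S}^{j+1}]_{\ell,1} = [S^j]_{\ell,1}, \quad [\widehat{S}^{j+1}]_{\ell,2} = [S^j]_{\ell,2}, \quad \ell = 1, \cdots, \ell^*$$
$$[\widehat{S}^{j+1}]_{\ell^*+1,1} = j^*, \quad [\widehat{S}^{j+1}]_{\ell^*+1,2} = 1 + [S^j]_{\ell^*,2}$$
$$[\widehat{S}^{j+1}]_{\ell+1,1} = [S^j]_{\ell,1}, \quad [\widehat{S}^{j+1}]_{\ell+1,2} = 1 + [S^j]_{\ell,2}, \quad \ell = \ell^* + 1, \cdots, \kappa.$$

In Case (2), define $\widehat{S}^{j+1}$ as a $\kappa \times 2$ matrix such that

$$[\widehat{S}^{j+1}]_{\ell,1} = [S^j]_{\ell,1}, \quad [\widehat{S}^{j+1}]_{\ell,2} = [S^j]_{\ell,2}, \quad \ell = 1, \cdots, \ell^* - 1$$
$$[\widehat{S}^{j+1}]_{\ell,1} = [S^j]_{\ell,1}, \quad [\widehat{S}^{j+1}]_{\ell,2} = 1 + [S^j]_{\ell,2}, \quad \ell = \ell^*, \cdots, \kappa.$$

In Case (3), define $\widehat{S}^{j+1}$ as a $(\kappa + 1) \times 2$ matrix such that

$$[\widehat{S}^{j+1}]_{\ell,1} = [S^j]_{\ell,1}, \quad [\widehat{S}^{j+1}]_{\ell,2} = [S^j]_{\ell,2}, \quad \ell = 1, \cdots, \kappa$$
$$[\widehat{S}^{j+1}]_{\kappa+1,1} = j^*, \quad [\widehat{S}^{j+1}]_{\kappa+1,2} = 1 + [S^j]_{\kappa,2}.$$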
Step (3): Let $\kappa$ denote the number of rows of $\widehat{S}^{j+1}$. If $[\widehat{S}^{j+1}]_{\kappa,2} < N$ then
let $S^{j+1} = \widehat{S}^{j+1}$, otherwise find index $\ell$ by a bisection search such that
$[\widehat{S}^{j+1}]_{\ell-1,2} < N \le [\widehat{S}^{j+1}]_{\ell,2}$ and define $S^{j+1}$ as an $\ell \times 2$ matrix such that

6.2.2. Violations Tracking. In this section, we describe how to update the matrix
of violations in the case of $I(X_{j+1}) = 1$. The key idea is to ensure condition (6.3).
Let $\kappa$ be the number of rows of $S^j$. Let $\tau$ be the number of rows of $V^j$. Let $j^*$ be the
index obtained in the process of updating $S^j$ such that $X_{j+1} \in B_{r_i}$ for $j^* \le i \le m$
and $X_{j+1} \notin B_{r_i}$ for $1 \le i \le j^* - 1$. We proceed as follows.

Step (i): Identify the maximal index l such that the experiment for uncer- 
tainty radius has not been completed by the following method. 

<^ If [S'-']k,2 < N, then let l = k, otherwise find i by a bisection search 
such that [V^ki < [S%,i, [V^,+i,i > [S^.^i. 

Step (ii): Modify as a temporary matrix based on the following two 
cases. 

Case(a) : [S%,2 < N or [S%,2 - N, [V^l+i,! = [S^^i. 

Case(b) : [S'-']k,2 > N and the index l guarantees [V^^]t+i,i > [S'-']k,i. 

In Case (a), we define — . In Case (b), we define as a (r + 1) x 2 
matrix such that 

[v']e 1 = [y'ki, [y'k2 = [v'k2, i = k--- ,i 

[v\+^,, = [S'ki, [v'Ui.2 = [V'k2 

[^^Vi,i = [^'ki' [V'V+ia^[V'k2, t^i + k---,T. 

step (iii): Obtain V^'^^ by modifying based on the following three cases. 

Case (i): [V^k-i <f< [V^k+i,i for some T G {1, • • • , t - 1}; 
Case (n): j* [V-^J^m for some £* e {1, • • • , i}; 
Case (iii): f > [V^,,i. 

Let T be the number of rows of . In Case (i), define yj+i as a (t+1) x2 
matrix such that 

= [V'ku [V'+\2 = [v^t2, £ = 1, • • • ,r 

[^'■+']r+i,i = [V^+'k+i.2 = 1 + [V^k,2 

[y'^\+i.i = [v'ki, [V'^']i+u2 = i + [v^k2, £ = e* + i,---,i 

In Case (ii), define V^^^ as a r x 2 matrix such that 

[y'^\i^[y'ki^ [v^+\2 ^ [v^k2, e^i,---j*-i 
[v^+\,^[v^ku [v=+\2 ^ [v^k2, i^i + i,---,T. 
In Case (iii), define $V^{j+1}$ as a $(\tau + 1) \times 2$ matrix such that




6.3. Complexity of Data Processing and Memory. It can be seen that the
memory requirement and the computation due to data processing are determined by
the sizes of matrices $S^j$ and $V^j$. To quantify the complexity, we have the following
results.

Theorem 5. For any j , the following statements hold true: 

(I) : The number of rows of matrix $S^j$ is no more than $N$;

(II) : The expected number of rows of matrix $V^j$ is no greater than



with $\rho_0 = \sup\{r \mid \mathbb{P}(r) = 1\}$.

See Appendix E for a proof. We now revisit the robustness analysis problem
discussed in Remark 1 of Section 4 from the perspective of memory complexity.
Assume that each data record $(k_i, n_i)$ (or each row of $V^j$) occupies 4 bytes of
computer internal memory (RAM). As illustrated in Section 5.2, when using the
conventional data structure, it takes 13G (gigabytes) of RAM to save the data
and visualize the results. On the other hand, in our new algorithm, if the smallest 
proportion is = minj,g|-a P(r) > 0.999 and fi. < |, the RAM requirement will 
be equivalent to 



It can be seen that such a memory requirement is extremely low compared to
that of the conventional method. Theorem 5 also reveals that the complexity of
data processing is very low.

6.4. Confidence Band. To be useful, any numerical technique should provide
a method for error assessment. Monte Carlo simulation is no exception. The
following results allow us to construct a confidence band for the robustness curve.
Such post-experimental statistical inference can remedy the conservatism of the a priori
choice of sample size $N$ based on the Chernoff bound. In order to overcome the
computational complexity of the Clopper-Pearson confidence interval [9], we have
developed new methods to facilitate the construction of the confidence band.



(6.4) 




h = max min A, — ) , 1 





X 10^ 
Theorem 6. Let 6 € (0, 1). Let C{k) = ^ + | ^ " '^i+on'''^^ ^'^^ ^(^) = 
^ + I i-t+Vi+4^ Mi^ ^^^^ Q ^ 9 ^ iei 

Ki = N — v(i,n), i — 1, ■ ■ ■ ,m. 

Let<; = 1- 7^^. £>e/?ne P(r) = C + (1 - C) l^{K.,+i) + and P(r) = 

C + (1 -0 ^K,+i) - <;. Then 

$$\Pr\{\underline{\mathbb{P}}(r) < \mathbb{P}(r) < \overline{\mathbb{P}}(r),\ \forall r \in [r_i, r_{i+1}]\} \ge 1 - \delta.$$

See Appendix F for a proof. The family of intervals $[\underline{\mathbb{P}}(r), \overline{\mathbb{P}}(r)]$, $r \in [a/\lambda, a]$, is
referred to as the confidence band. It is important to note that the confidence band
can be efficiently constructed by making use of the piece-wise constant property
of $v(i, n)$. It can be shown that the computational complexity of constructing the
confidence band is also absolutely bounded.

7. Conclusion 

It is possible to make a case for the statement that probabilistic robustness
analysis is essentially the study of the robustness function, especially its
probabilistic implications, efficient evaluation and computational complexity. We
have addressed these issues in this paper. In particular, we have developed
randomized algorithms which offer more insight into system robustness. We rigorously
show that, in terms of both computer running time and memory requirement, the
complexity of such randomized algorithms is not only linear in the dimension of the
uncertainty space, but also surprisingly low. While the complexity of the conventional
method grows linearly with the number of grid points and the error due to
interpolation is not well controlled, our techniques resolve both issues. In
short, our method guarantees accuracy and efficiency.

Appendix A. Proof of Theorem 1
We first establish a basic inequality that will be used to prove the theorem.
Lemma 1. For any $x > 1$,

$$\frac{1}{x} + \ln x > 1.$$

Proof. Let

$$f(x) = \frac{1}{x} + \ln x.$$

Then $f(1) = 1$ and

$$\frac{d f(x)}{dx} = \frac{x - 1}{x^2} > 0, \quad \forall x > 1.$$

It follows that $f(x) > 1$, $\forall x > 1$. □
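Now we are in the position to prove Theorem 1. Observing that

$$\lambda^d = \left(\frac{r_m}{r_1}\right)^d = \prod_{i=1}^{m-1} \left(\frac{r_{i+1}}{r_i}\right)^d,$$

we have

$$\ln\left(\lambda^d\right) = \sum_{i=1}^{m-1} \ln\left(\frac{r_{i+1}}{r_i}\right)^d.$$

Therefore,

$$\sum_{i=1}^{m-1} \left(\frac{r_i}{r_{i+1}}\right)^d + \ln\left(\lambda^d\right) = \sum_{i=1}^{m-1} \left[\left(\frac{r_i}{r_{i+1}}\right)^d + \ln\left(\frac{r_{i+1}}{r_i}\right)^d\right].$$

Since $\left(\frac{r_{i+1}}{r_i}\right)^d > 1$, $i = 1, \cdots, m - 1$, it follows from Lemma 1 that

$$\left(\frac{r_i}{r_{i+1}}\right)^d + \ln\left(\frac{r_{i+1}}{r_i}\right)^d > 1, \quad i = 1, \cdots, m - 1.$$

Hence,

$$\sum_{i=1}^{m-1} \left[\left(\frac{r_i}{r_{i+1}}\right)^d + \ln\left(\frac{r_{i+1}}{r_i}\right)^d\right] > m - 1,$$

or equivalently,

$$\sum_{i=1}^{m-1} \left(\frac{r_i}{r_{i+1}}\right)^d + d \ln \lambda > m - 1.$$

Finally, by Theorem 1 of [6] and the definition of $m_{\mathrm{eq}}$, we have

$$m_{\mathrm{eq}} = m - \sum_{i=1}^{m-1} \left(\frac{r_i}{r_{i+1}}\right)^d < 1 + d \ln \lambda.$$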
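The bound can be checked numerically. The sketch below assumes the equivalent number of grid points is $m - \sum_{i=1}^{m-1} (r_i / r_{i+1})^d$ on a grid running from $a/\lambda$ to $a$; the function names are illustrative and do not come from the paper.

```python
import math


def uniform_grid(a, lam, m):
    """m equally spaced radii from a/lam to a (m >= 2)."""
    lo = a / lam
    return [lo + (a - lo) * i / (m - 1) for i in range(m)]


def geometric_grid(a, lam, m):
    """m radii from a/lam to a with constant ratio lam**(1/(m-1)) (m >= 2)."""
    return [(a / lam) * lam ** (i / (m - 1)) for i in range(m)]


def equivalent_grid_points(radii, dim):
    """m minus the sum of (r_i / r_{i+1})**dim over consecutive radii."""
    reuse = sum((radii[i] / radii[i + 1]) ** dim for i in range(len(radii) - 1))
    return len(radii) - reuse


if __name__ == "__main__":
    lam, dim = 10.0, 3
    bound = 1.0 + dim * math.log(lam)
    for m in (10, 100, 1000):
        m_eq = equivalent_grid_points(geometric_grid(1.0, lam, m), dim)
        print(m, m_eq, bound)
```

However fine the grid, the computed value stays below $1 + d \ln \lambda$, which is why refining the grid adds little sampling cost under sample reuse.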



Appendix B. Proof of Theorem [2] 

To prove the theorem, we need some preliminary results. It is derived in [2] that 
Jll < M when P(.) is differentiable. The following lemma indicates that the 
bound on the rate of variation of P(.) can be much tighter. 

Lemma 2. For arbitrary robustness requirement,

$$|\mathbb{P}(r + \Delta r) - \mathbb{P}(r)| \le 1 - \left(1 + \frac{\Delta r}{r}\right)^{-d} \le \frac{d}{r} \Delta r$$

for any $r > 0$ and any $\Delta r > 0$.



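Proof. Let $Q_r \subseteq B_r$ be the set such that the robustness requirement is satisfied.
Let

$$I_1 = \frac{\mathrm{vol}(Q_{r+\Delta r})}{\mathrm{vol}(B_{r+\Delta r})} - \frac{\mathrm{vol}(Q_r)}{\mathrm{vol}(B_{r+\Delta r})}, \qquad I_2 = \frac{\mathrm{vol}(Q_r)}{\mathrm{vol}(B_{r+\Delta r})} - \frac{\mathrm{vol}(Q_r)}{\mathrm{vol}(B_r)},$$

so that $\mathbb{P}(r + \Delta r) - \mathbb{P}(r) = I_1 + I_2$.
Let "\" denote the operation of set minus. Observing that $Q_{r+\Delta r} \setminus Q_r \subseteq B_{r+\Delta r} \setminus B_r$,
we have $\mathrm{vol}(Q_{r+\Delta r}) - \mathrm{vol}(Q_r) \le \mathrm{vol}(B_{r+\Delta r}) - \mathrm{vol}(B_r)$. Using this fact and the
identity $\mathrm{vol}(B_r) = r^d\, \mathrm{vol}(B_1)$, we have

$$0 \le I_1 \le \frac{\mathrm{vol}(B_{r+\Delta r}) - \mathrm{vol}(B_r)}{\mathrm{vol}(B_{r+\Delta r})} = 1 - \left(1 + \frac{\Delta r}{r}\right)^{-d}$$

and

$$-\left[1 - \left(1 + \frac{\Delta r}{r}\right)^{-d}\right] \le I_2 \le 0.$$

Therefore, $|\mathbb{P}(r + \Delta r) - \mathbb{P}(r)| = |I_1 + I_2| \le 1 - \left(1 + \frac{\Delta r}{r}\right)^{-d} \le \frac{d}{r} \Delta r$, where the last
inequality follows from the inequality $1 - \left(1 + \frac{x}{d}\right)^{-d} \le x$, $\forall x > 0$, applied with
$x = \frac{d\, \Delta r}{r}$. To prove this inequality, we can define function $h(x) = 1 - \left(1 + \frac{x}{d}\right)^{-d} - x$
and check that $h(0) = 0$ and

$$\frac{d h(x)}{dx} = \left(1 + \frac{x}{d}\right)^{-(d+1)} - 1 < 0, \quad \forall x > 0.$$

□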



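We are now in the position to prove the theorem. It can be shown that

$$|\mathbb{P}(r) - \mathbb{P}^*(r)| = \frac{\left|(r_{i+1} - r)[\mathbb{P}(r) - \mathbb{P}(r_i)] + (r - r_i)[\mathbb{P}(r) - \mathbb{P}(r_{i+1})]\right|}{r_{i+1} - r_i} \le \frac{(r_{i+1} - r)\,|\mathbb{P}(r) - \mathbb{P}(r_i)| + (r - r_i)\,|\mathbb{P}(r) - \mathbb{P}(r_{i+1})|}{r_{i+1} - r_i}. \tag{B.1}$$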



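By Lemma 2 and inequality (B.1), we have

$$|\mathbb{P}(r) - \mathbb{P}^*(r)| \le \frac{(r_{i+1} - r)\left[1 - \left(\frac{r_i}{r}\right)^d\right] + (r - r_i)\left[1 - \left(\frac{r}{r_{i+1}}\right)^d\right]}{r_{i+1} - r_i} = 1 - \frac{g(r)}{r_{i+1} - r_i}$$

where

$$g(r) = (r_{i+1} - r)\left(\frac{r_i}{r}\right)^d + (r - r_i)\left(\frac{r}{r_{i+1}}\right)^d.$$

Note that

$$\frac{d g(r)}{dr} = \Phi(r) - \Psi(r)$$

where

$$\Phi(r) = \left(\frac{r}{r_{i+1}}\right)^d \left[1 + \left(1 - \frac{r_i}{r}\right) d\right], \qquad \Psi(r) = \left(\frac{r_i}{r}\right)^d \left[1 + \left(\frac{r_{i+1}}{r} - 1\right) d\right].$$

It can be verified that

$$\Phi(r_i) = \left(\frac{r_i}{r_{i+1}}\right)^d < 1, \qquad \Psi(r_i) = 1 + \left(\frac{r_{i+1}}{r_i} - 1\right) d > 1,$$
$$\Phi(r_{i+1}) = 1 + \left(1 - \frac{r_i}{r_{i+1}}\right) d > 1, \qquad \Psi(r_{i+1}) = \left(\frac{r_i}{r_{i+1}}\right)^d < 1,$$
$$\left.\frac{d g(r)}{dr}\right|_{r = r_i} < 0, \qquad \left.\frac{d g(r)}{dr}\right|_{r = r_{i+1}} > 0.$$

It can be checked that $\Phi(r)$ is a monotone increasing function of $r$ and that $\Psi(r)$ is a
monotone decreasing function of $r$. Hence, $\frac{d g(r)}{dr}$ is a monotone increasing function
of $r$. Moreover, there exists a unique $r^* \in (r_i, r_{i+1})$ such that $\left.\frac{d g(r)}{dr}\right|_{r = r^*} = 0$, i.e.,
$\Phi(r^*) = \Psi(r^*)$. Furthermore, $g(r)$ is a convex function of $r$. Consequently,

$$\min_{r \in [r_i, r_{i+1}]} g(r) = g(r^*)$$

and we have shown

$$|\mathbb{P}(r) - \mathbb{P}^*(r)| \le 1 - \frac{g(r^*)}{r_{i+1} - r_i}.$$

Since $\frac{d g(r)}{dr}$ is a monotone increasing function of $r$, we can compute $r^*$ by a bisection
search over interval $(r_i, r_{i+1})$.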
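By Lemma 2 and inequality (B.1), we have

$$|\mathbb{P}(r) - \mathbb{P}^*(r)| \le \frac{(r_{i+1} - r)(r - r_i)\frac{d}{r_i} + (r - r_i)(r_{i+1} - r)\frac{d}{r}}{r_{i+1} - r_i}$$
$$\le \frac{(r_{i+1} - r)(r - r_i)\frac{d}{r_i} + (r - r_i)(r_{i+1} - r)\frac{d}{r_i}}{r_{i+1} - r_i}$$
$$\le \frac{2d}{r_i (r_{i+1} - r_i)} \max_{r \in [r_i, r_{i+1}]} (r_{i+1} - r)(r - r_i)$$
$$= \frac{2d}{r_i (r_{i+1} - r_i)} \cdot \frac{(r_{i+1} - r_i)^2}{4} = \frac{d (r_{i+1} - r_i)}{2 r_i}.$$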
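The two interpolation error bounds can be compared numerically. The sketch below assumes $g(r) = (r_{i+1} - r)(r_i / r)^d + (r - r_i)(r / r_{i+1})^d$, locates its minimizer by bisection on the sign of $dg/dr$, and returns both the bound $1 - g(r^*) / (r_{i+1} - r_i)$ and the simpler bound $d (r_{i+1} - r_i) / (2 r_i)$. The function name and the tolerance are illustrative choices.

```python
def interpolation_error_bounds(r_lo, r_hi, dim, tol=1e-12):
    """Return (r_star, tight_bound, simple_bound) for one grid interval."""

    def phi(r):
        return (r / r_hi) ** dim * (1.0 + (1.0 - r_lo / r) * dim)

    def psi(r):
        return (r_lo / r) ** dim * (1.0 + (r_hi / r - 1.0) * dim)

    def g(r):
        return (r_hi - r) * (r_lo / r) ** dim + (r - r_lo) * (r / r_hi) ** dim

    # dg/dr = phi - psi is increasing, negative at r_lo and positive at r_hi,
    # so its unique root can be bracketed and halved.
    lo, hi = r_lo, r_hi
    while hi - lo > tol * r_hi:
        mid = 0.5 * (lo + hi)
        if phi(mid) - psi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    r_star = 0.5 * (lo + hi)
    tight = 1.0 - g(r_star) / (r_hi - r_lo)
    simple = dim * (r_hi - r_lo) / (2.0 * r_lo)
    return r_star, tight, simple


if __name__ == "__main__":
    print(interpolation_error_bounds(1.0, 1.1, 5))
```

The bisection bound is never larger than the closed-form one, while the closed-form bound is the one that yields explicit grid sizes.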

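Appendix C. Proof of Theorem 3
By Theorem 2, $|\mathbb{P}(r) - \mathbb{P}^*(r)| < \frac{d (r_{i+1} - r_i)}{2 r_i}$, $\forall r \in [r_i, r_{i+1}]$. Thus, it suffices to
show $\frac{d (r_{i+1} - r_i)}{2 r_i} < \varepsilon$, i.e.,

$$\frac{r_{i+1}}{r_i} < 1 + \frac{2 \varepsilon}{d}. \tag{C.1}$$

By definition (4.1), for $i = 1, \cdots, m - 1$,

$$\frac{r_{i+1}}{r_i} = \frac{1 - \frac{(m - i - 1)(\lambda - 1)}{(m - 1) \lambda}}{1 - \frac{(m - i)(\lambda - 1)}{(m - 1) \lambda}} = 1 + \frac{\lambda - 1}{m - 1 + (\lambda - 1)(i - 1)} \le 1 + \frac{\lambda - 1}{m - 1}.$$

By virtue of (C.1), to guarantee that the gridding error is less than $\varepsilon$, it suffices to
ensure $1 + \frac{\lambda - 1}{m - 1} < 1 + \frac{2 \varepsilon}{d}$, i.e., $m > 1 + \frac{(\lambda - 1) d}{2 \varepsilon}$. Hence, it suffices to have

$$m \ge 2 + \left\lfloor \frac{(\lambda - 1) d}{2 \varepsilon} \right\rfloor.$$

It can be verified that

$$\frac{r_i}{r_{i+1}} = 1 - \frac{1}{\frac{m - 1}{\lambda - 1} + i}, \quad i = 1, \cdots, m - 1.$$

By Theorem 1 of [6], the sample reuse factor is given by

$$F_{\mathrm{reuse}} = \frac{m}{m - \sum_{i=1}^{m-1} \left(\frac{r_i}{r_{i+1}}\right)^d} = \frac{m}{m - \sum_{i=1}^{m-1} \left(1 - \frac{1}{\frac{m - 1}{\lambda - 1} + i}\right)^d}.$$

Therefore

$$m_{\mathrm{eq}}(\varepsilon) = \frac{m}{F_{\mathrm{reuse}}} = m - \sum_{i=1}^{m-1} \left(1 - \frac{1}{\frac{m - 1}{\lambda - 1} + i}\right)^d.$$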
Appendix D. Proof of Theorem 4
By virtue of (4.2), we have $\frac{r_{i+1}}{r_i} = \lambda^{\frac{1}{m-1}}$. Hence, by (C.1), it suffices to show
$\lambda^{\frac{1}{m-1}} < 1 + \frac{2 \varepsilon}{d}$, which can be reduced to $m > 1 + \frac{\ln \lambda}{\ln\left(1 + \frac{2 \varepsilon}{d}\right)}$. This inequality is

By equation (|3.f p and Theorem f of , we have 



equivalent to to > 2 - 
E[n] = TO — (m — 1) 



In A 



and hence obtain moq(e). Note that 



reuse — 



TOeq(e) 



> 



2 + 


In A 




Li"(i+f)J 



In A 



1 + d In A 



f + d In A 



> 



f + d In A " 



Making use of the inequality $\ln(1 + x) < x$, $\forall x > 0$, we have $\ln\left(1 + \frac{2 \varepsilon}{d}\right) < \frac{2 \varepsilon}{d}$.
Therefore,

In A 



^r< 



> 



1 + d In A 



1 



1 + d In A 



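Appendix E. Proof of Theorem 5

Proof of statement (I): Obviously, $[S^j]_{1,2} \ge 1$, $[S^j]_{\kappa,2} \le N$. From the
rules of sampling, we can perform induction with respect to $j$ and have
$[S^j]_{\ell+1,2} - [S^j]_{\ell,2} \ge 1$, $\ell = 1, \cdots, \kappa - 1$. Observing that

$$[S^j]_{\kappa,2} = [S^j]_{1,2} + \sum_{\ell=1}^{\kappa-1} \left([S^j]_{\ell+1,2} - [S^j]_{\ell,2}\right)$$
$$\ge 1 + \sum_{\ell=1}^{\kappa-1} \left([S^j]_{\ell+1,2} - [S^j]_{\ell,2}\right)$$
$$\ge 1 + \kappa - 1 = \kappa,$$

we have $\kappa \le [S^j]_{\kappa,2} \le N$.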
Proof of statement (II): We need some preliminary results. 



Lemma 3. Let 1 < i < m — 1. Then 



1 



Ji+i 
Proof. Note that 

1 - 



d{ri+i - n) 



< 



d{d - 1) / r^+i - r, 



d(n+i - n) 



{i-ty -{1-dt) 



where $t = \frac{r_{i+1} - r_i}{r_{i+1}}$. It can be checked that

$$(1 - t)^d - (1 - dt) \le \frac{d(d-1)}{2} t^2$$

for $d = 1, 2$. For $d \ge 3$, by Taylor's expansion formula, there exists $\zeta \in (0, t)$
such that

$$(1 - t)^d = 1 - dt + \frac{d(d-1)}{2} t^2 (1 - \zeta)^{d-2}.$$

Observing that $0 < t < 1$ since $0 < r_i < r_{i+1}$, we hence have $0 < (1 - \zeta)^{d-2} < 1$
and $(1 - t)^d - (1 - dt) < \frac{d(d-1)}{2} t^2$ for $d \ge 3$. Therefore, for
any $d \ge 1$,



d{ri+i - n) 



^ d{d-i) ^^ _ d{d-i) fn+i 



□ 



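Lemma 4. Define the maximum gap between grid points as

$$\varpi = \max_{1 \le i \le m-1} (r_{i+1} - r_i).$$

Then

$$\sum_{i=1}^{m-1} \left(\frac{r_{i+1} - r_i}{r_{i+1}}\right)^2 \le \frac{\lambda (\lambda - 1) \varpi}{a}.$$

Proof. Note that $r_{i+1} > r_1 = \frac{a}{\lambda}$ and $r_{i+1} - r_i \le \varpi$, so that

$$\sum_{i=1}^{m-1} \left(\frac{r_{i+1} - r_i}{r_{i+1}}\right)^2 \le \sum_{i=1}^{m-1} \left(\frac{r_{i+1} - r_i}{r_1}\right)^2 \le \frac{\varpi}{r_1^2} \sum_{i=1}^{m-1} (r_{i+1} - r_i).$$

By successive cancelation,

$$\sum_{i=1}^{m-1} (r_{i+1} - r_i) = r_m - r_1 = a - \frac{a}{\lambda}.$$

Hence,

$$\sum_{i=1}^{m-1} \left(\frac{r_{i+1} - r_i}{r_{i+1}}\right)^2 \le \frac{\varpi \left(a - \frac{a}{\lambda}\right)}{\left(\frac{a}{\lambda}\right)^2} = \frac{\lambda (\lambda - 1) \varpi}{a}.$$

□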
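A quick numerical check of the inequality of Lemma 4 on an arbitrary grid is sketched below. It assumes the grid runs from $a/\lambda$ to $a$, so that $\lambda$ is recovered as the ratio of the largest to the smallest radius; the function name is illustrative.

```python
import random


def lemma4_sides(radii):
    """Return (lhs, rhs) of the inequality for an ascending grid of radii."""
    a = radii[-1]
    lam = radii[-1] / radii[0]
    gaps = [radii[i + 1] - radii[i] for i in range(len(radii) - 1)]
    lhs = sum((gap / radii[i + 1]) ** 2 for i, gap in enumerate(gaps))
    rhs = lam * (lam - 1.0) * max(gaps) / a
    return lhs, rhs


if __name__ == "__main__":
    rng = random.Random(1)
    interior = sorted(rng.uniform(0.5, 2.0) for _ in range(30))
    print(lemma4_sides([0.5] + interior + [2.0]))
```

Since the right-hand side is proportional to the maximum gap, the sum of squared relative gaps vanishes as the grid is refined.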



Lemma 5. The expected number of rows of the matrix of violations $V^j$ is
no greater than

$$1 + N P_e(a) + 2N \sum_{i=1}^{m-1} P_e(r_i) \left[1 - \left(\frac{r_i}{r_{i+1}}\right)^d\right].$$

Proof. Let $X_1^j, \cdots, X_{n_j}^j$ be the samples generated from uncertainty set with
radius $r_j$. Let $Y_i^j = I(X_i^j)$, $i = 1, \cdots, n_j$. By the principle of sample
reuse, the value of $n_j$ depends only on the samples generated from
uncertainty sets with radius $r_k$, $j + 1 \le k \le m$. Consequently event
$\{n_j = \nu\}$ is independent of event $\{Y_i^j = 1\}$ and $\Pr\{Y_i^j = 1,\ n_j = \nu\} =
\Pr\{Y_i^j = 1\} \Pr\{n_j = \nu\}$. By the definitions of $Y_i^j$ and $P_e(\cdot)$, we have

$$\Pr\{Y_i^j = 1\} = 1 - \mathbb{P}(r_j) \le P_e(r_j). \tag{E.1}$$

Therefore,

$$E\left[\sum_{i=1}^{n_j} Y_i^j\right] = \sum_{\nu=1}^{N} \sum_{i=1}^{\nu} \Pr\{Y_i^j = 1,\ n_j = \nu\}$$
$$= \sum_{\nu=1}^{N} \sum_{i=1}^{\nu} \Pr\{Y_i^j = 1\} \Pr\{n_j = \nu\}$$
$$\le \sum_{\nu=1}^{N} \nu\, P_e(r_j) \Pr\{n_j = \nu\}$$
$$= P_e(r_j)\, E[n_j]$$

for $j = 1, \cdots, m$. We now consider $V^n$ with $n = \sum_{i=1}^{m} n_i$. By the mechanism
of the sample reuse algorithms, for $j = 1, \cdots, m - 1$, every new sample
from uncertainty set with radius $r_j$ creates at most $2 Y_i^j$, $i = 1, \cdots, n_j$ new
rows for the matrix of violations (see Section 6.2.2). Note that $X_1$ creates
at most $1 + Y_1^m$ rows for $V^1$. Every new sample from uncertainty set with
radius $r_m$ creates at most $Y_i^m$, $i = 2, \cdots, n_m$ new rows for the matrix of
violations. Hence

$$E[\text{The number of rows of matrix } V^n] \le 1 + E\left[\sum_{i=1}^{n_m} Y_i^m\right] + 2 E\left[\sum_{j=1}^{m-1} \sum_{i=1}^{n_j} Y_i^j\right]$$
$$\le 1 + P_e(r_m)\, E[n_m] + 2 \sum_{j=1}^{m-1} P_e(r_j)\, E[n_j].$$

By Lemma 6 of [6] we have

$$E[n_i] = N \left[1 - \left(\frac{r_i}{r_{i+1}}\right)^d\right], \quad i = 1, \cdots, m - 1.$$

By (E.1) and using the fact that $E[n_m] = N$, $P_e(r_m) = P_e(a)$, we have

$$E[\text{The number of rows of matrix } V^n] \le 1 + N P_e(a) + 2N \sum_{j=1}^{m-1} P_e(r_j) \left[1 - \left(\frac{r_j}{r_{j+1}}\right)^d\right].$$

□



Lemma 6. For any grid scheme, 



E 



^E 



Pe{ri+i){ri+i - ri) 



did — 1)A(A — l)zD Xd ^ , , , . \d ^ , , , 

n ^ — ^ n ^ — ^ 



2a 



1=1 
Proof. Note that

m — 1 



2^1 

m — 1 

^ E 

rn— 1 

+ E 

m — 1 

< E 



1 - 



m — l 

^E 



Pe(ri+i)(rj+i - Ti) 



ri+i 



-d 



Pe{ri){ri+i - n) 



^ Pe{ri){r^+i - r,) _ ^ Pe(n+i)(T'i+i - n) 



1=1 



1 - 



d 



din+i ~ ri) 



Xd 



m—l 



H [Pe{ri+i){ri+i - ri) - Pe(»'i)(fi+i - rt)] 

n ^ — ^ 



where the last inequality follows from the facts that $0 \le P_e(r_i) \le P_e(r_{i+1}) \le
1$ and $r_{i+1} > \frac{a}{\lambda}$. Making use of Lemma 3 and Lemma 4, we have



E ^^(^^) 



1 



^E 



Pein+i)iri+i - n) 



< 



d[d- 1) I Ti+i - Ti 



E 



y^Pe{ri+i){r^+i~ri) Pe{n){ri+i - n) 

i=l i=l 



1)A(A ~ Ad \^ \ p / \t \ 

< ■- 1 > Pe[ri+i)[n+i - r^) } Pe(ri)(ri+i - n). 

i=l i=\ 

□ 



2a 



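Lemma 7. For a set of grid points $\mathcal{G} = \{r_i \mid 1 \le i \le m\}$ with $\frac{a}{\lambda} = r_1 <
r_2 < \cdots < r_m = a$, define function $\mathcal{H}$ such that

$$\mathcal{H}(\mathcal{G}) = \sum_{i=1}^{m-1} P_e(r_i) \left[1 - \left(\frac{r_i}{r_{i+1}}\right)^d\right].$$

Then for any two sets of grid points $\mathcal{G}_1$ and $\mathcal{G}_2$ such that $\mathcal{G}_1 \subseteq \mathcal{G}_2$,

$$\mathcal{H}(\mathcal{G}_1) \le \mathcal{H}(\mathcal{G}_2).$$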

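Proof. It suffices to treat the case in which $\mathcal{G}_2$ contains exactly one more grid
point than $\mathcal{G}_1$, since the general case follows by adding grid points one at a time.
Consider two sequences of grid points $\mathcal{G}_1 = \{r_i \mid 1 \le i \le m\}$ and
$\mathcal{G}_2 = \{\hat{r}_\ell \mid 1 \le \ell \le m + 1\}$ such that

$$\frac{a}{\lambda} = r_1 < r_2 < \cdots < r_m = a, \qquad \frac{a}{\lambda} = \hat{r}_1 < \hat{r}_2 < \cdots < \hat{r}_{m+1} = a,$$

and that $\mathcal{G}_2$ is obtained from $\mathcal{G}_1$ by adding a grid point $\hat{r}_{i+1}$ to interval
$(r_i, r_{i+1})$ where $1 \le i \le m - 1$, i.e., $\hat{r}_j = r_j$, $j = 1, \cdots, i$ and $\hat{r}_{j+1} =
r_j$, $j = i + 1, \cdots, m$. By the definition of function $\mathcal{H}$, we have

$$\mathcal{H}(\mathcal{G}_2) - \mathcal{H}(\mathcal{G}_1) = P_e(r_i) \left[1 - \left(\frac{r_i}{\hat{r}_{i+1}}\right)^d\right] + P_e(\hat{r}_{i+1}) \left[1 - \left(\frac{\hat{r}_{i+1}}{r_{i+1}}\right)^d\right] - P_e(r_i) \left[1 - \left(\frac{r_i}{r_{i+1}}\right)^d\right].$$

By virtue of the fact that $P_e(\hat{r}_{i+1}) \ge P_e(r_i)$, we have

$$\mathcal{H}(\mathcal{G}_2) - \mathcal{H}(\mathcal{G}_1) \ge P_e(r_i) \left[1 - \left(\frac{r_i}{\hat{r}_{i+1}}\right)^d - \left(\frac{\hat{r}_{i+1}}{r_{i+1}}\right)^d + \left(\frac{r_i}{r_{i+1}}\right)^d\right].$$

Recalling that $r_i < \hat{r}_{i+1} < r_{i+1}$, we have

$$1 - \left(\frac{r_i}{\hat{r}_{i+1}}\right)^d - \left(\frac{\hat{r}_{i+1}}{r_{i+1}}\right)^d + \left(\frac{r_i}{r_{i+1}}\right)^d = \frac{\left(\hat{r}_{i+1}^d - r_i^d\right)\left(r_{i+1}^d - \hat{r}_{i+1}^d\right)}{\hat{r}_{i+1}^d\, r_{i+1}^d} > 0.$$

It follows that $\mathcal{H}(\mathcal{G}_2) - \mathcal{H}(\mathcal{G}_1) \ge 0$. □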



We are now in the position to prove statement (II) of the theorem. For
any set of grid points, we can reduce the maximal gap between grid points
by adding grid points. Every new grid point is placed at the middle of
one of the previous intervals which possess the largest width, in order to
ensure that, as more grid points are added, the maximal gap between grid points
tends to zero. In this process, we create a series of nested sets of grid
points $\mathcal{G}_k$, $k = 1, 2, \cdots, \infty$ such that $\mathcal{G}_1 \subseteq \mathcal{G}_2 \subseteq \mathcal{G}_3 \subseteq \cdots$. Note that



m — 1 



< 



m — 1 



-d 



" Pe{x) 



dx 



^E 



-Pe(7-i+l)(n+i - r^) 



1=1 



Ti+l 



^E 



1=1 



n+1 



-da; 



2) 



d(d — 1)A(A — l)ti7 \d "-^—^ ^ , , , . 

< ^ — V — + — E '^^+1 '^^+1 - ^0 



Ad 



2a 

m— 1 



n ' ^ 



+d 



i=l 
m—l 



E 



Pe{rt+i){ri+i - n) Pe{x) 



-dx 



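where inequality (E.2) follows from Lemma 6. By Lemma 2, $\mathbb{P}(\cdot)$ is a
continuous function with respect to $r$. Consequently, $\frac{P_e(x)}{x}$ is Riemann
integrable over interval $\left[\frac{a}{\lambda}, a\right]$ and

$$\lim_{k \to \infty} \sum_{i=1}^{m-1} \frac{P_e(r_{i+1})(r_{i+1} - r_i)}{r_{i+1}} = \int_{\frac{a}{\lambda}}^{a} \frac{P_e(x)}{x}\, dx.$$

Moreover, since $P_e(x)$ is Riemann integrable, we have

$$\lim_{k \to \infty} \sum_{i=1}^{m-1} P_e(r_{i+1})(r_{i+1} - r_i) = \lim_{k \to \infty} \sum_{i=1}^{m-1} P_e(r_i)(r_{i+1} - r_i) = \int_{\frac{a}{\lambda}}^{a} P_e(x)\, dx.$$

Hence, the right hand side of inequality (E.2) can be made arbitrarily small
by successively cutting the gap between grid points in half with new grid
points. This proves that

$$\lim_{k \to \infty} \mathcal{H}(\mathcal{G}_k) = d \int_{\frac{a}{\lambda}}^{a} \frac{P_e(x)}{x}\, dx.$$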



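On the other hand, by Lemma 7, we have $\mathcal{H}(\mathcal{G}_1) \le \mathcal{H}(\mathcal{G}_2) \le \mathcal{H}(\mathcal{G}_3) \le
\cdots$. Combining the convergence and the monotone property of sequence
$\{\mathcal{H}(\mathcal{G}_k)\}_{k=1}^{\infty}$, we can conclude that $\mathcal{H}(\mathcal{G}) \le d \int_{\frac{a}{\lambda}}^{a} \frac{P_e(x)}{x}\, dx$ for any set of grid
points $\mathcal{G}$. By Lemma 5, the expected number of rows of the matrix of
violations is no greater than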



1 + NPe{a) + 2Ni\{G) < 1 + NPe{a) + 2Nd 
= l + NPe{a) + 2Nd 



° Pe{x) 

L X 

" Pe{x) 



dx 
dx 
for any $\mathcal{G}$. Such bound applies to any $V^j$ because the number of rows of $V^j$
is non-decreasing with respect to $j$. Finally, the inequality of (6.4) can be
proved by making use of the observation that $P_e(x) \le P_e(a)$ for all $x \le a$.

Appendix F. Proof of Theorem 6
We need the following lemma, which has recently been obtained in T]. 

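Lemma 8. Let $X_i$, $i = 1, \cdots, N$ be i.i.d. Bernoulli random variables such that
$\Pr\{X_i = 1\} = 1 - \Pr\{X_i = 0\} = P_X > 0$. Let $K = \sum_{i=1}^{N} X_i$. Then
$\Pr\{\mathcal{L}(K) < P_X < \mathcal{U}(K)\} \ge 1 - \delta$.

Applying Lemma 8, we have $\Pr\{\mathcal{L}(K_{i+1}) < \mathbb{P}(r_{i+1}) < \mathcal{U}(K_{i+1})\} \ge 1 - \frac{\delta}{2}$ and
$\Pr\{\mathcal{L}(K_i) < \mathbb{P}(r_i) < \mathcal{U}(K_i)\} \ge 1 - \frac{\delta}{2}$. Hence by Bonferroni's inequality,

$$\Pr\{\mathcal{L}(K_{i+1}) < \mathbb{P}(r_{i+1}) < \mathcal{U}(K_{i+1}),\ \mathcal{L}(K_i) < \mathbb{P}(r_i) < \mathcal{U}(K_i)\} \ge 1 - \delta.$$

By the definitions of $\mathbb{P}^*(r)$, $\underline{\mathbb{P}}(r)$ and $\overline{\mathbb{P}}(r)$, we have that event $\{\mathcal{L}(K_{i+1}) < \mathbb{P}(r_{i+1}) <
\mathcal{U}(K_{i+1}),\ \mathcal{L}(K_i) < \mathbb{P}(r_i) < \mathcal{U}(K_i)\}$ implies event $\{\underline{\mathbb{P}}(r) + \varsigma < \mathbb{P}^*(r) < \overline{\mathbb{P}}(r) - \varsigma,\ \forall r \in
[r_i, r_{i+1}]\}$. Hence, $\Pr\{\underline{\mathbb{P}}(r) + \varsigma < \mathbb{P}^*(r) < \overline{\mathbb{P}}(r) - \varsigma,\ \forall r \in [r_i, r_{i+1}]\} \ge 1 - \delta$. By
Theorem 2 and the gridding scheme, $\Pr\{|\mathbb{P}^*(r) - \mathbb{P}(r)| < \varsigma,\ \forall r \in [r_i, r_{i+1}]\} = 1$.
Applying Bonferroni's inequality, we have

$$\Pr\{\underline{\mathbb{P}}(r) + \varsigma < \mathbb{P}^*(r) < \overline{\mathbb{P}}(r) - \varsigma,\ |\mathbb{P}^*(r) - \mathbb{P}(r)| < \varsigma,\ \forall r \in [r_i, r_{i+1}]\} \ge 1 - \delta. \tag{F.1}$$

Finally, the theorem is proved by observing that the left hand side of inequality
(F.1) is no greater than $\Pr\{\underline{\mathbb{P}}(r) < \mathbb{P}(r) < \overline{\mathbb{P}}(r),\ \forall r \in [r_i, r_{i+1}]\}$.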